DEVELOPERS' WORKSHOP: Welcome to the Cow...Debugging Device Drivers
By Victor Tsou

"Developers' Workshop" is a new weekly feature that provides answers to our developers' questions. Each week, a Be technical support or documentation professional will choose a question (or two) sent in by an actual developer and provide an answer. 

We've created a new section on our web site. Please send us your Newsletter topic suggestions by visiting the web site at: http://www.be.com/developers/suggestion_box.html. 
  
On my way home last night, a figure stepped out of the shadows and rhythmically rapped into my ear: 
Question: Why is it that every time my device driver runs, my system hangs? Fuuuunnnk daaaat! 
Development behind the 2 GB iron curtain is often difficult, so the kernel provides several functions to help you track down bugs in your device driver or kernel add-on. You can follow along at home in the "Exported Kernel Functions" section of the "Device Drivers" chapter in the online Be Book.

The most primitive debugging function is dprintf(), which squirts formatted text through the serial port, much like the SERIAL_PRINT() macro of Support Kit fame. Unlike SERIAL_PRINT(), dprintf() is accessible from kernel space. As always, serial communication occurs through /dev/serial1 on x86, /dev/serial4 on the BeBox, and /dev/modem on the Mac, at 19200 baud with no parity, 8 data bits, and 1 stop bit (N81).

No doubt you've been meaning to dust off your VT52 anyway. 

If nothing comes out of the serial port, serial output is probably disabled. There are a few ways of turning it on, including: 

Holding down the delete key on Macs or the F1 key on x86 or BeBox machines during bootup. 
Calling set_dprintf_enabled(true), as detailed in the Be Book. 
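
As a rough sketch of the programmatic route, a driver might switch serial output on when it loads and send its messages through a small tracing macro. The TRACE macro, the "mydriver" prefix, the BUILD_FOR_KERNEL switch, and the user-space stand-ins are all hypothetical, added only so the fragment can be compiled outside the kernel. The sketch also assumes that set_dprintf_enabled() hands back the previous setting; check the Be Book before relying on that.

```c
#ifdef BUILD_FOR_KERNEL              /* hypothetical switch from your makefile */
#include <KernelExport.h>            /* dprintf(), set_dprintf_enabled() */
#define DEBUG_OUT dprintf
#else
/* User-space stand-ins so the sketch can be compiled and run anywhere. */
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

static bool serial_enabled = false;
static int  messages_sent = 0;

static bool set_dprintf_enabled(bool enabled)
{
    bool previous = serial_enabled;  /* assumed: the old setting comes back */
    serial_enabled = enabled;
    return previous;
}

static void fake_dprintf(const char *format, ...)
{
    va_list args;

    if (!serial_enabled)
        return;                      /* output is dropped while disabled */
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    messages_sent++;
}
#define DEBUG_OUT fake_dprintf
#endif

/* Prefix every message so it stands out in a busy serial log. */
#define TRACE(args) do { DEBUG_OUT("mydriver: "); DEBUG_OUT args; } while (0)

static bool serial_was_enabled;

/* Call from init_driver(): force serial output on while the driver is loaded. */
static void debug_output_begin(void)
{
    serial_was_enabled = set_dprintf_enabled(true);
    TRACE(("debug output enabled\n"));
}

/* Call from uninit_driver(): put the setting back the way we found it. */
static void debug_output_end(void)
{
    TRACE(("debug output restored\n"));
    set_dprintf_enabled(serial_was_enabled);
}
```

Note the doubled parentheses in TRACE((...)); they let a plain C macro forward a variable argument list to the print function.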

Once you've established a serial debugging connection, you may find yourself a frequent guest of the kernel debugger. The kernel debugger is typically triggered by an exception in kernel space. 

You can also programmatically enter it with kernel_debugger(). 

The kernel debugger, for the most part, presents a read-only snapshot of the universe. This tool possesses limited poking abilities, but there is currently no way of modifying register values and breakpoints short of wading through stack frames and writing over code. Typing "help" will give you a list of the debugger's capabilities. Don't bother to RTFM; that's all the M that's available for now (this will be rectified in the future).

The debugger understands symbols found in xMAP files, which is helpful in deciphering stack traces. Every aspiring kernel driver writer should copy the kernel's xMAP from the installation CD to /system. The kernel can be instructed to load your driver's symbols with load_driver_symbols(). This function searches for a specific xMAP in the drivers, file_systems, pnp, and cam subdirectories of the kernel add-ons directories.

In large functions, it's often difficult to identify the precise source line triggering the exception. 

Fortunately, the -g and -machinecodelist mwcc options, used together, produce an enlightening listing of your code with assembly and source interleaved. Match the faulting address reported by the debugger against that listing and you'll locate the crash in no time at all.

Look for kernel debugger improvements in R4, including breakpoints, tracing, and improved register shadowing (how many times have you wanted to do 'dis eip' or 'esp += 8'?). These commands and more will help ease the burden of locating bugs and can be used to avert impending crashes through judicious stack and pc manipulation.

In fine Hiroshi fashion, here are three insightful yet unrelated techniques for faster driver development: 

ASSERT() - Although there is no predefined ASSERT() macro for kernel drivers, there's nothing to prevent you from selfishly cobbling one together for your personal use: 


assert.h: 
#ifndef DEBUG 
#define ASSERT(c) 0 
#else 
int _assert_(char *,int,char *); 
#define ASSERT(c) (!(c) ? _assert_(__FILE__,__LINE__,#c) : 0) 
#endif 

assert.c: 
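#ifdef DEBUG 

#include <KernelExport.h>   /* dprintf(), kernel_debugger() */
#include "assert.h"

/* Report the failed expression over the serial port, then stop in the debugger. */
int _assert_(char *file, int line, char *expr) 
{ 
    dprintf("tripped assertion in %s/%d (%s)\n", file, line, expr); 
    kernel_debugger("tripped assertion"); 
    return 0; 
} 

#endif 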
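
To see how the macro reads at a call site, here is a small sketch of a ring-buffer helper guarded by ASSERT(). The ring structure and its function are invented for illustration, and _assert_() is replaced by a user-space stand-in that only counts failures so the fragment runs outside the kernel; in a driver it would call dprintf() and kernel_debugger() as in assert.c.

```c
#include <stddef.h>

#define DEBUG 1                     /* normally set by the build, not here */

/* Stand-in for the kernel version in assert.c: counts instead of halting. */
static int tripped_assertions = 0;

static int _assert_(const char *file, int line, const char *expr)
{
    (void)file; (void)line; (void)expr;
    tripped_assertions++;
    return 0;
}

#ifndef DEBUG
#define ASSERT(c) 0
#else
#define ASSERT(c) (!(c) ? _assert_(__FILE__, __LINE__, #c) : 0)
#endif

#define RING_SIZE 8                 /* hypothetical buffer size */

struct ring {
    unsigned char data[RING_SIZE];
    size_t head;                    /* next slot to write */
    size_t count;                   /* bytes currently stored */
};

/* Store one byte; returns 0 on success, -1 on a NULL or full ring. */
static int ring_put(struct ring *r, unsigned char byte)
{
    ASSERT(r != NULL);
    if (r == NULL)
        return -1;
    ASSERT(r->head < RING_SIZE);    /* catches a corrupted index early */

    if (r->count == RING_SIZE)
        return -1;
    r->data[r->head % RING_SIZE] = byte;
    r->head = (r->head + 1) % RING_SIZE;
    r->count++;
    return 0;
}
```

Because the release build compiles ASSERT() away, the function still performs its own NULL and bounds handling rather than depending on the assertion.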

ioctl() - If the driver/add-on protocol includes an ioctl() facility, use it as a runtime debugging aid. For example, ioctls may be defined to print out important data structures or verify data integrity. Here's a short program you can modify to issue human-readable ioctl() commands to your driver. 

#include <fcntl.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 
#include <unistd.h> 

struct cmds { 
    char *string; 
    int code; 
} commands[] = { 
  /* replace with your ioctls */ 
    { "dumpinfo", 10000 }, 
    { "verifyintegrity", 10001 }, 
    { "simulateerror", 10002 }, 
    { "reset", 10003 }, 
    { NULL, 0 } 
}; 

static void print_help() 
{ 
    int i; 
    printf("usage: ioctl command files...\n"); 
    printf("commands: %s", commands[0].string); 
    for (i=1;commands[i].string != NULL;i++) 
        printf(", %s", commands[i].string); 
    printf("\n"); 
    exit(-1); 
} 

int main(int argc, char **argv) 
{ 
    int fd, i, code; 

    if (argc < 3) print_help(); 

    for (i=0;commands[i].string;i++) { 
        if (!strcasecmp(commands[i].string, argv[1])) { 
            code = commands[i].code; 
            break; 
        } 
    } 

    if (commands[i].string == NULL) { 
        for (i=0;argv[1][i];i++) 
            if ((argv[1][i] < '0') || (argv[1][i] > '9')) 
                break; 
        if (argv[1][i]) 
            print_help(); 
        code = atoi(argv[1]); 
    } 

    for (i=2;i<argc;i++) { 
        if ((fd = open(argv[i], O_RDWR)) < 0) { 
            /* open() returns -1 and sets errno; perror() reports it */
            perror(argv[i]); 
            continue; 
        } 
        ioctl(fd, code); 
        close(fd); 
    } 

    return 0; 
} 
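
On the driver side, the matching control hook might look roughly like the sketch below. The opcode values mirror the placeholder table in the program above. The device structure, its checksum scheme, and the TRACE_OUT stand-in are hypothetical, and the Be types and error codes are approximated with local definitions so the fragment compiles in user space. A real driver would include the Be headers, print with dprintf(), and likely return a more specific error for unknown opcodes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local approximations of Be types and constants (assumptions). */
typedef int32_t status_t;
#define B_OK     0
#define B_ERROR  (-1)

#define TRACE_OUT printf            /* a driver would use dprintf() here */

enum {                              /* same placeholder codes as the tool */
    MYDEV_DUMP_INFO = 10000,
    MYDEV_VERIFY_INTEGRITY,
    MYDEV_SIMULATE_ERROR,
    MYDEV_RESET
};

struct mydev {                      /* hypothetical per-device state */
    uint32_t reads, writes;
    uint32_t checksum;              /* expected to equal reads ^ writes */
    int      fail_next;             /* set by "simulateerror" */
};

/* Modeled on a device control hook: (cookie, op, data, len). */
static status_t mydev_control(void *cookie, uint32_t op, void *data, size_t len)
{
    struct mydev *dev = cookie;

    (void)data; (void)len;          /* these debug ioctls take no arguments */

    switch (op) {
    case MYDEV_DUMP_INFO:
        TRACE_OUT("mydev: reads=%lu writes=%lu fail_next=%d\n",
                  (unsigned long)dev->reads, (unsigned long)dev->writes,
                  dev->fail_next);
        return B_OK;
    case MYDEV_VERIFY_INTEGRITY:
        return (dev->checksum == (dev->reads ^ dev->writes)) ? B_OK : B_ERROR;
    case MYDEV_SIMULATE_ERROR:
        dev->fail_next = 1;         /* the next I/O request should fail */
        return B_OK;
    case MYDEV_RESET:
        dev->reads = dev->writes = dev->checksum = 0;
        dev->fail_next = 0;
        return B_OK;
    default:
        return B_ERROR;             /* unknown opcode */
    }
}
```

With a hook like this in place, running the tool as "ioctl dumpinfo" followed by your device's path would print the counters over the serial line.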

add_debugger_command() - ioctl() is nice and all, but you'll soon lust for some of that post-mortem loving.  Fortunately, the kernel debugger allows you to hook in new commands with add_debugger_command(). This function registers a callback with the kernel debugger that is called with main()-style argc/argv arguments.

When writing a kernel debugger command, remember that it may be called while the kernel is in an unpredictable state. This means malloc() and friends are off limits, as they may induce swapping. dprintf() is also a no-no; use the stripped-down equivalent, kprintf(), instead. 

Since the callbacks for added kernel debugger commands normally live in the driver's memory space, invoking one after the driver has been unloaded sends the kernel debugger into unallocated memory. Drivers should therefore remove added commands with remove_debugger_command() (new for R3.1) when they are unloaded.

Want code? 

int do_echo(int argc, char **argv) 
{ 
  int i; 
  if (argc == 1) { 
    kprintf("echo <args> - prints arguments\n"); 
    return 0; 
  } 
  for (i=1;i<argc;i++) 
    kprintf("%s\n", argv[i]); 
  return 0; 
} 

status_t init_driver() 
{ 
  add_debugger_command("echo", do_echo, 
    "echo <args> - prints arguments"); 
  ... 
} 

status_t uninit_driver() 
{ 
  remove_debugger_command("echo", do_echo); 
  ... 
}
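
A slightly more useful command follows the same pattern. The sketch below dumps a table of per-port counters that lives in static storage, so the callback never allocates memory, and it parses its argument by hand rather than leaning on library routines. The table, the command name, and the KPRINT stand-in are invented for illustration; KPRINT maps to printf() here so the fragment runs in user space, whereas a driver would use kprintf() and register the command with add_debugger_command() as in init_driver() above.

```c
#include <stdio.h>

#define KPRINT printf               /* a driver would use kprintf() here */

#define NUM_PORTS 4                 /* hypothetical device layout */

static struct port_stats {
    unsigned long interrupts;
    unsigned long overruns;
} ports[NUM_PORTS];                 /* static: nothing to allocate later */

/* Parse a small decimal port number by hand; returns -1 on bad input. */
static int parse_port(const char *s)
{
    int value = 0;

    if (*s == '\0')
        return -1;
    for (; *s; s++) {
        if (*s < '0' || *s > '9')
            return -1;
        value = value * 10 + (*s - '0');
        if (value >= NUM_PORTS)
            return -1;              /* out of range; also stops overflow */
    }
    return value;
}

/* "portstats [n]" - dump one port, or all ports when n is omitted. */
static int do_portstats(int argc, char **argv)
{
    int first = 0, last = NUM_PORTS - 1, i;

    if (argc > 2) {
        KPRINT("usage: portstats [port]\n");
        return 0;
    }
    if (argc == 2) {
        first = last = parse_port(argv[1]);
        if (first < 0) {
            KPRINT("portstats: bad port '%s'\n", argv[1]);
            return 0;
        }
    }
    for (i = first; i <= last; i++)
        KPRINT("port %d: %lu interrupts, %lu overruns\n",
               i, ports[i].interrupts, ports[i].overruns);
    return 0;
}
```

Keeping the data in a static table means the command still has something to read after a crash, which is exactly when you will want it.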